
    Sparse Dowker Nerves

    We propose sparse versions of the filtered simplicial complexes used to compute persistent homology of point clouds and of networks. In particular, we extend a slight variation of the Sparse Čech Complex of Cavanna, Jahanseir and Sheehy from point clouds in Cartesian space to point clouds in arbitrary metric spaces. Along the way we formulate interleaving in terms of strict 2-categories, and we introduce the concept of Dowker dissimilarities, which can be considered a common generalization of metric spaces and networks. Comment: 25 pages.
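    A Dowker dissimilarity can be pictured as a matrix Λ (Lam in the code below) whose entry Λ[l][w] says how far landmark l is from witness w; at scale t, a set of landmarks spans a simplex of the Dowker nerve exactly when some common witness is within t of all of them. Below is a minimal brute-force Python sketch of this (non-sparse) nerve; the function name and toy matrix are our own illustration, not the paper's code.

        from itertools import combinations

        def dowker_nerve(Lam, t, max_dim=2):
            # Simplices of the Dowker nerve of dissimilarity matrix Lam at scale t.
            n_land, n_wit = len(Lam), len(Lam[0])
            # witnesses[i] = witnesses within distance t of landmark i
            witnesses = [{j for j in range(n_wit) if Lam[i][j] <= t}
                         for i in range(n_land)]
            nerve = []
            for k in range(1, max_dim + 2):  # simplices with k vertices
                for sigma in combinations(range(n_land), k):
                    if set.intersection(*(witnesses[i] for i in sigma)):
                        nerve.append(sigma)
            return nerve

        # A metric space is the special case where landmarks equal witnesses and
        # Lam is the distance matrix; a network is the general, possibly
        # asymmetric case.
        Lam = [[0.0, 1.0, 4.0],
               [1.0, 0.0, 2.0],
               [4.0, 2.0, 0.0]]
        print(dowker_nerve(Lam, t=1.0))  # [(0,), (1,), (2,), (0, 1)]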

    Relative Persistent Homology

    The alpha complex efficiently computes persistent homology of a point cloud X in Euclidean space when the dimension d is low. Given a subset A of X, relative persistent homology can be computed as the persistent homology of the relative Čech complex Č(X, A). But this is not computationally feasible for larger point clouds X. The aim of this note is to present a method for efficient computation of relative persistent homology in low-dimensional Euclidean space. We introduce the relative Delaunay–Čech complex DelČ(X, A), whose persistent homology is the relative persistent homology. It is constructed from the Delaunay complex of an embedding of X in (d+1)-dimensional Euclidean space.
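    For context, the alpha-complex baseline that the note starts from runs in a few lines; the sketch below assumes the gudhi package and computes absolute (not relative) persistent homology. The relative Delaunay–Čech complex DelČ(X, A) itself is not implemented here.

        import numpy as np
        import gudhi  # assumed available: pip install gudhi

        rng = np.random.default_rng(0)
        X = rng.standard_normal((100, 2))  # point cloud in the plane (d = 2)

        alpha = gudhi.AlphaComplex(points=X)
        st = alpha.create_simplex_tree()   # filtered Delaunay (alpha) complex
        diagram = st.persistence()         # (dimension, (birth, death)) pairs
        print([p for p in diagram if p[0] == 1][:5])  # a few 1-dim features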

    Semi-conditional variational auto-encoder for flow reconstruction and uncertainty quantification from limited observations

    We present a new data-driven model to reconstruct nonlinear flow from spatially sparse observations. The proposed model is a version of a Conditional Variational Auto-Encoder (CVAE), which allows for probabilistic reconstruction and thus uncertainty quantification of the prediction. We show that in our model, conditioning on measurements from the complete flow data leads to a CVAE where only the decoder depends on the measurements. For this reason, we call the model a semi-conditional variational auto-encoder. The method, reconstructions, and associated uncertainty estimates are illustrated on velocity data from simulations of 2D flow around a cylinder and on bottom currents from a simulation of the southern North Sea by the Bergen Ocean Model. The reconstruction errors are compared to those of the Gappy proper orthogonal decomposition method.
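    The structural point, that the measurements m are a deterministic function of the complete field x, so the approximate posterior q(z | x) need not see m while the decoder p(x | z, m) does, fits in a few lines. A minimal sketch in PyTorch (our choice of framework; layer sizes are illustrative, not the paper's architecture):

        import torch
        import torch.nn as nn

        class SemiConditionalVAE(nn.Module):
            def __init__(self, n_field=1024, n_meas=16, n_latent=8):
                super().__init__()
                self.encoder = nn.Sequential(          # q(z | x): no m here
                    nn.Linear(n_field, 128), nn.ReLU(),
                    nn.Linear(128, 2 * n_latent))      # mean and log-variance
                self.decoder = nn.Sequential(          # p(x | z, m)
                    nn.Linear(n_latent + n_meas, 128), nn.ReLU(),
                    nn.Linear(128, n_field))

            def forward(self, x, m):
                mu, logvar = self.encoder(x).chunk(2, dim=-1)
                z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
                return self.decoder(torch.cat([z, m], dim=-1)), mu, logvar

    At prediction time, sampling several z from the prior and decoding each with the observed measurements m yields an ensemble of reconstructions, i.e. an uncertainty estimate for the recovered field.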

    Binary time series classification with Bayesian convolutional neural networks when monitoring for marine gas discharges

    The world's oceans are under stress from climate change, acidification and other human activities, and the UN has declared 2021–2030 the decade for marine science. To monitor marine waters, with the purpose of detecting discharges of tracers from unknown locations, large areas must be covered with limited resources. To increase the detectability of marine gas seepage we propose a deep probabilistic learning algorithm, a Bayesian Convolutional Neural Network (BCNN), to classify time series of measurements. The BCNN classifies a time series as belonging to a leak or no-leak situation, including the classification uncertainty. The latter is important for decision makers who must decide whether to initiate costly confirmation surveys and hence want to avoid false positives. Results from a transport model are used for the learning process of the BCNN, and the task is to distinguish the signal of a leak hidden within the natural variability. We show that the BCNN classifies time series arising from leaks with high accuracy and estimates its associated uncertainty. We combine the output of the BCNN model, the posterior predictive distribution, with a Bayesian decision rule, showcasing how the framework can be used in practice to make optimal decisions based on a given cost function.
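    To illustrate the last step, a Bayesian decision rule reduces to comparing the expected costs of the available actions under the predicted leak probability. The cost numbers below are placeholders, not values from the paper:

        def expected_costs(p_leak, cost_survey=1.0, cost_missed_leak=50.0):
            # surveying costs the same regardless; ignoring risks a missed leak
            return {"survey": cost_survey, "ignore": p_leak * cost_missed_leak}

        def decide(p_leak, **costs):
            c = expected_costs(p_leak, **costs)
            return min(c, key=c.get)  # action with least expected cost

        # With these costs a survey pays off once p_leak exceeds 1/50 = 0.02.
        print(decide(0.01), decide(0.10))  # -> ignore survey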

    Kernelization for Finding Lineal Topologies (Depth-First Spanning Trees) with Many or Few Leaves

    For a given graph G, a depth-first search (DFS) tree T of G is an r-rooted spanning tree such that every edge of G is either an edge of T or lies between a descendant and an ancestor in T. A graph G together with a DFS tree is called a lineal topology 𝒯 = (G, r, T). Sam et al. (2023) initiated the study of the parameterized complexity of the Min-LLT and Max-LLT problems, which ask, given a graph G and an integer k ≥ 0, whether G has a DFS tree with at most k and at least k leaves, respectively. In particular, they showed that for the dual parameterization, where the tasks are to find DFS trees with at least n−k and at most n−k leaves, respectively, these problems are fixed-parameter tractable when parameterized by k. However, the proofs were based on Courcelle's theorem, making the running times a tower of exponentials. We prove that both problems admit polynomial kernels with O(k^3) vertices. In particular, this implies FPT algorithms running in k^O(k) · n^O(1) time. We achieve these results by making use of an O(k)-sized vertex cover structure associated with each problem. This also allows us to demonstrate polynomial kernels for Min-LLT and Max-LLT for the structural parameterization by the vertex cover number. Comment: 16 pages, accepted for presentation at FCT 2023.
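    To make the optimized quantity concrete, the sketch below computes a DFS spanning tree of a small graph and counts its leaves; the paper's kernelization itself is not reproduced, and all names are our own.

        def dfs_tree_leaves(adj, root=0):
            # adj: dict mapping each vertex to its neighbours. Every non-tree
            # edge of the graph connects an ancestor and a descendant of the
            # returned tree -- the defining property of a lineal topology.
            children = {v: [] for v in adj}
            visited = {root}

            def dfs(u):
                for v in adj[u]:
                    if v not in visited:
                        visited.add(v)
                        children[u].append(v)
                        dfs(v)

            dfs(root)
            return [v for v in adj if v in visited and not children[v]]

        # Every DFS tree of a cycle is a Hamiltonian path: exactly one leaf.
        adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
        print(dfs_tree_leaves(adj))  # [3]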

    Optimal sensors placement for detecting CO2 discharges from unknown locations on the seafloor

    Assurance monitoring of the marine environment is a required and intrinsic part of any CO2 storage project. To reduce the costs related to the monitoring effort, the monitoring program must be designed with optimal use of instrumentation. Here we use the solution of a classical set cover problem to design the placement of an array of fixed chemical sensors whose purpose is to detect a seep of CO2 through the seafloor from an unknown location. The solution of the problem is not unique, and different aspects, such as cost or existing infrastructure, can be added to define an optimal solution. We formulate an optimization problem and propose a method to generate footprints of potential seeps using an advection–diffusion model and a stoichiometric method for detection of small CO2 seepage signals. We provide some numerical experiments to illustrate the concepts.
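    The combinatorial core is set cover: each candidate sensor location covers the seep sites whose simulated footprint reaches it, and we want a few sensors that together cover all sites. The classic greedy heuristic below is one standard way to attack the problem and is our own illustration; the paper's formulation may add further criteria such as cost:

        def greedy_sensor_placement(footprints, sites):
            # footprints: dict location -> set of seep sites detectable there
            uncovered, chosen = set(sites), []
            while uncovered:
                best = max(footprints,
                           key=lambda loc: len(footprints[loc] & uncovered))
                if not footprints[best] & uncovered:
                    raise ValueError("some sites are covered by no sensor")
                chosen.append(best)
                uncovered -= footprints[best]
            return chosen

        footprints = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1, 5}}
        print(greedy_sensor_placement(footprints, {1, 2, 3, 4, 5}))  # ['A', 'C']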

    Targeted Maximum Likelihood Estimation for Dynamic and Static Longitudinal Marginal Structural Working Models

    This paper describes a targeted maximum likelihood estimator (TMLE) for the parameters of longitudinal static and dynamic marginal structural models. We consider a longitudinal data structure consisting of baseline covariates, time-dependent intervention nodes, intermediate time-dependent covariates, and a possibly time-dependent outcome. The intervention nodes at each time point can include a binary treatment as well as a right-censoring indicator. Given a class of dynamic or static interventions, a marginal structural model is used to model the mean of the intervention-specific counterfactual outcome as a function of the intervention, time point, and possibly a subset of baseline covariates. Because the true shape of this function is rarely known, the marginal structural model is used as a working model. The causal quantity of interest is defined as the projection of the true function onto this working model. Iterated conditional expectation double robust estimators for marginal structural model parameters were previously proposed by Robins (2000, 2002) and Bang and Robins (2005). Here we build on this work and present a pooled TMLE for the parameters of marginal structural working models. We compare this pooled estimator to a stratified TMLE (Schnitzer et al. 2014) that is based on estimating the intervention-specific mean separately for each intervention of interest. The performance of the pooled TMLE is compared to the performance of the stratified TMLE and the performance of inverse probability weighted (IPW) estimators using simulations. Concepts are illustrated using an example in which the aim is to estimate the causal effect of delayed switch following immunological failure of first-line antiretroviral therapy among HIV-infected patients. Data from the International Epidemiological Databases to Evaluate AIDS, Southern Africa are analyzed to investigate this question using both TMLE and IPW estimators. Our results demonstrate practical advantages of the pooled TMLE over an IPW estimator for working marginal structural models for survival, as well as cases in which the pooled TMLE is superior to its stratified counterpart.
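    The idea of projecting the true function onto a working model has a compact numeric illustration: below, toy counterfactual means are projected onto a linear working model by least squares, the kind of estimand targeted above (all numbers are invented):

        import numpy as np

        a = np.array([0.0, 1.0, 2.0, 3.0])          # interventions of interest
        truth = np.array([0.30, 0.42, 0.61, 0.70])  # toy true means E[Y_a]

        # Working model m(a; beta) = beta0 + beta1 * a. The estimand is the
        # projection of `truth` onto this model, well defined even though the
        # true curve need not be linear.
        design = np.column_stack([np.ones_like(a), a])
        beta, *_ = np.linalg.lstsq(design, truth, rcond=None)
        print(beta)  # projection parameters (intercept, slope)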

    gems: An R Package for Simulating from Disease Progression Models

    Mathematical models of disease progression predict disease outcomes and are useful epidemiological tools for planners and evaluators of health interventions. The R package gems is a tool that simulates disease progression in patients and predicts the effect of different interventions on patient outcome. Disease progression is represented by a series of events (e.g., diagnosis, treatment and death), displayed in a directed acyclic graph. The vertices correspond to disease states and the directed edges represent events. The package gems allows simulations based on a generalized multistate model that can be described by a directed acyclic graph with continuous transition-specific hazard functions. The user can specify an arbitrary hazard function and its parameters. The model includes parameter uncertainty, does not need to be a Markov model, and may take the history of previous events into account. Applications are not limited to the medical field and extend to other areas where multistate simulation is of interest. We provide a technical explanation of the multistate models used by gems, explain the functions of gems and their arguments, and show a sample application.
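    Since gems is an R package, the sketch below re-creates its basic mechanic in Python for illustration: a patient moves between the states of a directed acyclic graph, with an exponential hazard per transition standing in for the arbitrary, possibly history-dependent hazard functions gems supports.

        import random

        transitions = {                 # state -> [(next_state, rate), ...]
            "diagnosis": [("treatment", 0.5), ("death", 0.1)],
            "treatment": [("death", 0.05)],
            "death": [],
        }

        def simulate(start="diagnosis", seed=None):
            rng = random.Random(seed)
            state, t, history = start, 0.0, []
            while transitions[state]:
                # competing exponential clocks: the earliest event wins
                times = [(rng.expovariate(rate), nxt)
                         for nxt, rate in transitions[state]]
                dt, state = min(times)
                t += dt
                history.append((round(t, 2), state))
            return history

        print(simulate(seed=1))  # e.g. [(0.9, 'treatment'), (15.2, 'death')]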

    GUBS: Graph-Based Unsupervised Brain Segmentation in MRI Images

    Brain segmentation in magnetic resonance imaging (MRI) images is the process of isolating the brain from non-brain tissues to simplify further analysis, such as detecting pathology or calculating volumes. This paper proposes Graph-based Unsupervised Brain Segmentation (GUBS), which processes 3D MRI images and segments them into brain, non-brain tissues, and background. GUBS first constructs an adjacency graph from a preprocessed MRI image, weights it by the difference between voxel intensities, and computes its minimum spanning tree (MST). It then uses domain knowledge about the different regions of MRIs to sample representative points from the brain, non-brain, and background regions of the image. The graph nodes corresponding to the sampled points in each region are identified and used as terminal nodes for paths connecting the regions in the MST. GUBS then computes a subgraph of the MST by first removing the longest edge of the path connecting the terminal nodes in the brain and other regions, followed by removing the longest edge of the path connecting the non-brain and background regions. This process results in three labeled connected components, whose labels are used to segment the brain, non-brain tissues, and the background. GUBS was tested by segmenting 3D T1-weighted MRI images from three publicly available data sets, and it shows results comparable to state-of-the-art methods. However, many competing methods rely on labeled data for training; labeling is a time-intensive and costly process, and a big advantage of GUBS is that it requires no labels.
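    The core graph step is easy to see on a toy one-dimensional "image": build the intensity-difference graph, take its minimum spanning tree, and cut the longest MST edge on the path between two seed points, leaving one labeled component per region. The sketch assumes networkx and uses two regions; real GUBS works on 3D volumes with three.

        import networkx as nx

        intensities = [10, 11, 12, 50, 51, 52]  # two homogeneous regions
        G = nx.Graph()
        for i in range(len(intensities) - 1):   # chain adjacency, weight = |diff|
            G.add_edge(i, i + 1, weight=abs(intensities[i] - intensities[i + 1]))

        mst = nx.minimum_spanning_tree(G)
        seed_a, seed_b = 0, 5                   # sampled representative points
        path = nx.shortest_path(mst, seed_a, seed_b)
        edges = list(zip(path, path[1:]))
        cut = max(edges, key=lambda e: mst.edges[e]["weight"])
        mst.remove_edge(*cut)                   # longest edge on the path

        print(list(nx.connected_components(mst)))  # [{0, 1, 2}, {3, 4, 5}]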